Push compute engine value loading for longs down to tsdb codec. #132622
Conversation
This is the first of many changes that push loading of field values down to the es819 doc values codec for logsdb/tsdb indices, when the field supports it. This change targets reading field values in bulk mode at the codec level when the doc values type is numeric or sorted, there is only one value per document, and the field is dense (all documents have a value). Multivalued and sparse fields are more complex to support bulk reading for, but it is possible.

With this change, the following field types support bulk read mode at the codec level under the described conditions: long, date, geo_point, point and unsigned_long. Other number types like integer, short, double, float and scaled_float will be supported in a followup; they are similar to long based fields but require an additional conversion step to either an int or float vector.

This change originates from elastic#132460 (which adds bulk reading to `@timestamp`, `_tsid` and dimension fields) and is basically the timestamp support part of it. In another followup, support for single valued, dense sorted (set) doc values will be added for fields like `_tsid`.

Relates to elastic#128445
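For readers less familiar with the existing path, here is a minimal sketch of the per-document read loop that codec-level bulk reading avoids, using only public Lucene APIs rather than the PR's internal classes; the class, method, field name and doc-ID array are hypothetical, and the sketch assumes the field is single valued and dense:

import java.io.IOException;

import org.apache.lucene.index.DocValues;
import org.apache.lucene.index.LeafReader;
import org.apache.lucene.index.NumericDocValues;
import org.apache.lucene.index.SortedNumericDocValues;

// Illustrative sketch only: the per-document read path that the codec-level bulk read replaces.
// With the bulk path, a dense, single-valued field can instead copy whole decoded blocks of
// longs into the block builder in one call.
class PerDocumentLongReader {

    static long[] readLongs(LeafReader reader, String field, int[] docIds) throws IOException {
        SortedNumericDocValues sorted = DocValues.getSortedNumeric(reader, field);
        // Non-null only when every document has at most one value for the field (assumed here).
        NumericDocValues dv = DocValues.unwrapSingleton(sorted);
        long[] values = new long[docIds.length];
        for (int i = 0; i < docIds.length; i++) {
            // One advance and one decode per document, even when the doc IDs are consecutive.
            if (dv.advanceExact(docIds[i])) {
                values[i] = dv.longValue();
            }
        }
        return values;
    }
}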
optimize sparse loading.
… case didn't add measurable performance benefits.
Pinging @elastic/es-storage-engine (Team:StorageEngine)
Pinging @elastic/es-analytical-engine (Team:Analytics)
Hi @martijnvg, I've created a changelog YAML for you.
…ngleton_dense_long_loading
/**
 * Specialized builder for collecting dense arrays of long values.
 */
interface SingletonBulkLongBuilder extends Builder {
The plan is to reuse this builder interface for other number field types too, and even ordinal based fields, given that at the codec level everything is stored as long[]. For other, non-long field types we need a conversion step, but that can happen in the build() method. For example, converting to int[] by using Math.toIntExact(...), which can be done with a simple loop in the build() method. So I don't expect us to introduce more interfaces here.
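For example, a hedged sketch of the kind of conversion loop build() could run for int-backed fields; values and count stand in for hypothetical builder fields holding the appended longs:

// Illustrative only: convert the collected longs into an int[] inside build().
static int[] toIntArray(long[] values, int count) {
    int[] ints = new int[count];
    for (int i = 0; i < count; i++) {
        // Math.toIntExact throws ArithmeticException if a value does not fit in an int.
        ints[i] = Math.toIntExact(values[i]);
    }
    return ints;
}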
I think we need a SingletonInt instead for ordinals.
I think that makes sense once we generalize the builder.
Wow - more than a 5x speedup, impressive! Great changes; however, I think we should make them less invasive and more contained. Thanks, Martijn! I'm looking forward to seeing this PR merged.
server/src/main/java/org/elasticsearch/index/codec/tsdb/es819/BulkNumericDocValues.java
bulkReader = new BulkReader() {

    @Override
    public void bulkRead(BlockLoader.SingletonBulkLongBuilder builder, BlockLoader.Docs docs, int offset)
I think it would be more consistent to implement BlockLoader.Block read(BlockFactory factory, Docs docs, int offset) throws IOException instead.
}

-    private static class SingletonLongs extends BlockDocValuesReader {
+    static class SingletonLongs extends BlockDocValuesReader {
Can we enable the optimization in BlockLoader.Block read(BlockFactory factory, Docs docs, int offset) of this class only?

public BlockLoader.Block read(BlockFactory factory, Docs docs, int offset) throws IOException {
    if (numericDocValues instanceof ... r) {
        return r.read(factory, docs, offset);
    }
    ...
}
I think that should work as well.
It took me a while, but this works out and it is now even simpler! 054b12e
@@ -1013,7 +1013,8 @@ public Function<byte[], Number> pointReaderIfPossible() {
     @Override
     public BlockLoader blockLoader(BlockLoaderContext blContext) {
         if (hasDocValues()) {
-            return new BlockDocValuesReader.LongsBlockLoader(name());
+            var indexMode = blContext.indexSettings().getMode();
+            return new BlockDocValuesReader.LongsBlockLoader(name(), indexMode);
Do we need to pass indexMode? I think we can always enable optimizations if the underlying doc_values are dense and use our codec.
Good point, we can just check the implementation of the numeric doc values. That should be sufficient.
pushed: be0c77c
@@ -498,6 +509,14 @@ interface IntBuilder extends Builder {
        IntBuilder appendInt(int value);
    }

    /**
Can we rename this to SingletonLongBuilder and add support for appending a single long? I think we can use this builder when doc_values is dense, even if it's not from our codec. Also, we should consider extending LongVectorFixedBuilder to support bulking, but it's not an issue of this PR.
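A rough sketch of what the renamed interface could look like under this suggestion; the exact method names here are illustrative, not the final API:

/**
 * Illustrative sketch only: builder for dense, single-valued long fields that supports
 * both per-document appends and bulk copies from a decoded codec block.
 */
interface SingletonLongBuilder extends Builder {

    // Append one value, for readers that still advance document by document.
    SingletonLongBuilder appendLong(long value);

    // Append `length` values starting at `from`, e.g. straight from a decoded block of longs.
    SingletonLongBuilder appendLongs(long[] values, int from, int length);
}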
Renamed the class: 2924c402368fd58bc13adea5a943d8afa2fda963

I think we can use this builder when doc_values is dense, even if it's not from our codec.

I think so too; we would need to check numericDocValue#cost() == maxDoc in BlockDocValuesReader.SingletonLongs (see the sketch below)?

Also, we should consider extending LongVectorFixedBuilder to support bulking, but it's not an issue of this PR.

👍
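For context, a minimal sketch of the cost()-based check mentioned above; numericDocValues and leafReader are hypothetical locals, and as discussed further down this check was later reverted:

// Illustrative only: DocIdSetIterator#cost() is documented as an approximation, so equality
// with maxDoc does not strictly guarantee that every document has a value, which is why this
// approach was reverted later in the review.
boolean isDense = numericDocValues.cost() == leafReader.maxDoc();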
I pushed f097f4a to use the singleton long builder when the field is dense, even when not using the es819 doc values codec.
…dec and field is dense.
…ngleton_dense_long_loading
I've left more comments, but we're close. Thanks, Martijn!
if (numericDocValues.advanceExact(doc)) {
if (numericDocValues instanceof BulkNumericDocValues bulkDv) {
    return bulkDv.read(factory, docs, offset);
} else if (isDense) {
I think it's unsafe to use cost for this. Would you mind reverting this part? We can find a way to enable it later. We need to do something similar to FieldExistsQuery#rewrite.
Ok, I will revert.

I see that FieldExistsQuery#rewrite(...) relies on Terms#getDocCount(), which requires an inverted index for the same field. I think doc value skippers can also be used. But let's figure this out in another change.
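For reference, a hedged sketch of the kind of exact check FieldExistsQuery#rewrite relies on, assuming the field is also indexed with terms; the helper method name is hypothetical:

// Illustrative only: unlike cost(), Terms#getDocCount() is an exact per-field document count,
// but it requires an inverted index for the field.
static boolean allDocsHaveValue(LeafReader reader, String field) throws IOException {
    Terms terms = reader.terms(field);
    return terms != null && terms.getDocCount() == reader.maxDoc();
}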
int remainingBlockLength = ES819TSDBDocValuesFormat.NUMERIC_BLOCK_SIZE - blockInIndex;
for (int newLength = remainingBlockLength; newLength > 1; newLength = newLength >> 1) {
    int lastIndex = i + newLength - 1;
    if (lastIndex < docsCount && isDense(index, docs.get(lastIndex), newLength)) {
I like this logic! Can we limit remainingBlockLength to the min of (ES819TSDBDocValuesFormat.NUMERIC_BLOCK_SIZE - blockInIndex, docsCount - i) to allow a single copy of the last block? Note that there could be an issue with this logic for Lookup Join and Enrich, as the same doc IDs can appear multiple times. For example, this logic might mistakenly treat [1, 1, 2, 4] as [1, 2, 3, 4]. However, neither Lookup nor Enrich indices use this codec.
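For example, the suggested cap could look roughly like this (names follow the snippet above):

// Illustrative only: also cap by the number of requested docs that remain, so the trailing
// partial block can be copied in one go; with this cap the `lastIndex < docsCount` check
// becomes redundant.
int remainingBlockLength = Math.min(ES819TSDBDocValuesFormat.NUMERIC_BLOCK_SIZE - blockInIndex, docsCount - i);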
Can we limit remainingBlockLength to the min of (ES819TSDBDocValuesFormat.NUMERIC_BLOCK_SIZE - blockInIndex, docsCount - i) to allow a single copy of the last block?
Let me try this.
Note that there could be an issue with this logic for Lookup Join and Enrich, as the same doc IDs can appear multiple times. For example, this logic might mistakenly treat [1, 1, 2, 4] as [1, 2, 3, 4]. However, both Lookup and Enrich indices don't use this codec.
I will add a comment about this here.
pushed: 6ca5c66
I think we can remove the lastIndex < docsCount check?
public BlockLoader.SingletonLongBuilder appendLongs(long[] newValues, int from, int length) {
    try {
        System.arraycopy(newValues, from, values, count, length);
    } catch (ArrayIndexOutOfBoundsException e) {
leftover?
yes, for easy debugging :)
-    return docs[i];
+    try {
+        return docs[i];
+    } catch (ArrayIndexOutOfBoundsException e) {
leftover?
…value codec and field is dense." This reverts commit f097f4a.
LGTM. Thanks for all iterations @martijnvg!
docs/changelog/132622.yaml
@@ -0,0 +1,5 @@
pr: 132622
summary: Add bulk loading of dense singleton number doc values to tsdb codec and push compute engine value loading for longs down to tsdb codec
I think the summary doesn't match with the PR title?
Given that the optimization targets specific doc values fields produced by long field mappers, I experimented with the following query: FROM metrics-hostmetricsreceiver.otel-default | STATS min(@timestamp), max(@timestamp).

The metrics-hostmetricsreceiver.otel-default data stream contains 270 minutes of metrics, has 221184000 docs, and its storage is 8.5gb. On my local machine, the query time without this change is ~180ms and with this change ~70ms.

Flamegraph without this change: (flamegraph screenshot)

ESQL profiling of the query (with data_partitioning set to shard) without this change:

Flamegraph with this change: (flamegraph screenshot)

ESQL profiling of the query (with data_partitioning set to shard) with this change: